Assignment 04: Data Visualization — ggplot2 and Beyond
Author: Alona Nazarenko
Affiliation: Kyiv School of Economics
Data on Vehicles for 2025 🚗
As I’ve already visualized my previous dataset about freedom, I was looking for something new and interesting. During a lecture, Ihor Miroshnychenko showed us a website with open data for New York 🌆. That got me curious, so I searched the web and found a similar site for Ukraine 🇺🇦.
I easily found a dataset on vehicles covering 01.01.2025 through 30.09.2025, the most recent date available. You can check it out on the DIA portal 📊.
This dataset is very large, containing 1,641,470 rows and 22 columns.
For my analysis, having only the region code wasn’t enough — I also needed the region name.
So, I used web scraping to create a table that decodes the region codes from this site.
Another big pain was the letter “І”: in some fonts the Ukrainian letter looks almost identical to the Latin “I”, which caused mismatches when merging data. I had to carefully normalize all the text so that the Ukrainian letters were recognized correctly.
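The normalization step can be sketched roughly as follows. This is a minimal illustration, not the code used in the analysis: the helper `normalize_code` and both toy tables are hypothetical, and it assumes the only problem characters are a Latin `I`/`i` typed in place of the Cyrillic `І`/`і`. The strings are written with `\u` escapes so that the two look-alike letters can be told apart in the source.

```r
# Hypothetical helper: map the Latin look-alikes I/i to the Cyrillic
# letters U+0406/U+0456 and trim stray whitespace before any join.
normalize_code <- function(x) {
  trimws(chartr("Ii", "\u0406\u0456", x))
}

# Toy lookup table (illustrative values), keyed by the Cyrillic spelling.
regions <- data.frame(
  number_obl  = c("\u0410\u0406", "\u0412\u0406"),  # Cyrillic A + I, Cyrillic V + I
  region_name = c("Kyiv Oblast", "Poltava Oblast"),
  stringsAsFactors = FALSE
)

# Toy registrations: the first code was typed with a Latin "I".
cars <- data.frame(
  brand      = c("VOLKSWAGEN", "RENAULT"),
  number_obl = c("\u0410I", "\u0412\u0406"),
  stringsAsFactors = FALSE
)

# Joining the raw codes silently drops the row with the Latin letter.
raw_join <- merge(cars, regions, by = "number_obl")

# After normalization both rows find their region.
cars$number_obl <- normalize_code(cars$number_obl)
clean_join <- merge(cars, regions, by = "number_obl")

print(nrow(raw_join))
print(nrow(clean_join))
```

Comparing the row counts before and after the join is a cheap way to notice that a look-alike character is still hiding somewhere in the keys.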
Insights & Graphs 📊
The first thing I was curious about was which car brands Ukrainians prefer the most. To my surprise, it is Volkswagen.
Digging deeper, I found that the most popular model is the 2012 Volkswagen Passat, priced at about $9,000–$10,000. Its popularity may reflect the average Ukrainian salary: at that price, the car is within reach for many buyers.
Volkswagen Passat 2012
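A ranking like this boils down to counting rows per brand and per brand, model and year combination. Below is a minimal base R sketch under stated assumptions: the six rows are invented toy data, and only the column names `brand`, `model` and `make_year` are taken from the dataset.

```r
# Invented toy rows standing in for the registrations table.
cars <- data.frame(
  brand     = c("VOLKSWAGEN", "RENAULT", "VOLKSWAGEN", "SKODA", "VOLKSWAGEN", "RENAULT"),
  model     = c("PASSAT", "MEGANE", "PASSAT", "OCTAVIA", "PASSAT", "MEGANE"),
  make_year = c(2012, 2015, 2012, 2010, 2012, 2015),
  stringsAsFactors = FALSE
)

# Count registrations per brand and sort from most to least frequent.
brand_counts <- sort(table(cars$brand), decreasing = TRUE)
top_brand <- names(brand_counts)[1]

# Count each brand + model + year combination and pick the most frequent one.
combo <- paste(cars$brand, cars$model, cars$make_year)
top_combo <- names(which.max(table(combo)))

print(brand_counts)
print(top_combo)
```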
Next Prey 🐾
My next prey was the distribution of car types prowling our roads.
Code
# Packages used below; car_data is assumed to be loaded already.
library(dplyr)
library(forcats)
library(ggplot2)
library(plotly)

year_graph <- car_data |>
  # Keep only registrations and re-registrations
  filter(category_oper %in% c("Registration", "Re_registration")) |>
  # Order vehicle types by their median production year
  mutate(kind = fct_reorder(kind, make_year, .fun = median)) |>
  ggplot(aes(x = kind, y = make_year, colour = kind)) +
  geom_boxplot(outlier.alpha = 0.5) +
  scale_y_continuous(
    breaks = seq(
      min(car_data$make_year, na.rm = TRUE),
      max(car_data$make_year, na.rm = TRUE),
      by = 10
    )
  ) +
  scale_colour_manual(
    values = c(
      "#000000", "#1F5B89", "#337AB7", "#F25C05", "#FF9F1C",
      "#004D4D", "#009999", "#FF4FA3", "#C71585"
    )
  ) +
  labs(
    title = "Distribution of Vehicle Production Years by Type in Ukraine 🇺🇦",
    x = "Type of Vehicle",
    y = "Year of Manufacture"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(face = "bold", hjust = 0.5),
    legend.position = "none"
  )

# Convert the static plot into an interactive one
ggplotly(year_graph)
Most cars are up to 20 years old 🚗. The preference for older vehicles likely reflects affordability 💰, highlighting the importance of increasing income levels 📊.
A few unusual outliers caught my eye: cars recorded as more than 100 years old 🕰️🚘, which I couldn’t resist exploring 🔎.
Table 1: Outliers

| brand | model | make_year | color | purpose | fuel | vin | number_obl |
|---|---|---|---|---|---|---|---|
| ЗИЛ | 130 | 1900 | ЗЕЛЕНИЙ | СПЕЦІАЛЬНИЙ | БЕНЗИН АБО ГАЗ | 00000000000000146 | NA |
| ЗИЛ | 130 | 1900 | ЗЕЛЕНИЙ | СПЕЦІАЛЬНИЙ | БЕНЗИН | 1021382 | NA |
| HONDA | LEAD | 1900 | СІРИЙ | ЗАГАЛЬНИЙ | БЕНЗИН | JH1000AF481009931 | ВА |
| ГАЗ | 3307 | 1900 | СИНІЙ | ЗАГАЛЬНИЙ | БЕНЗИН | XTH330700P1452790 | NA |
| АЗЛК | 2140 | 1900 | СИНІЙ | ЗАГАЛЬНИЙ | БЕНЗИН | NA | NA |
| PACKARD | 180 | 1900 | ЧОРНИЙ | ЗАГАЛЬНИЙ | БЕНЗИН | 128213520 | СЕ |
| SUZUKI | LETS | 1900 | ЧЕРВОНИЙ | ЗАГАЛЬНИЙ | БЕНЗИН | JS1000CA1KA129840 | NA |
| PACKARD | 180 | 1900 | ЧОРНИЙ | ЗАГАЛЬНИЙ | БЕНЗИН | 128213520 | СЕ |
Packard 180
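Rows like these can be pulled out with a simple age filter. The sketch below is one possible approach, not necessarily how Table 1 was produced: the toy rows are invented, and the 100-year cutoff and the reference year 2025 are assumptions chosen because every row in Table 1 has `make_year` equal to 1900.

```r
# Invented toy rows; only the column names come from the dataset.
cars <- data.frame(
  brand     = c("PACKARD", "VOLKSWAGEN", "HONDA", "RENAULT"),
  make_year = c(1900, 2012, 1900, NA),
  stringsAsFactors = FALSE
)

# Assumed reference year: the dataset covers registrations in 2025.
reference_year <- 2025
cars$age <- reference_year - cars$make_year

# Keep rows with a known age above the assumed 100-year cutoff.
outliers <- cars[!is.na(cars$age) & cars$age > 100, ]

print(outliers)
```

A manufacture year of 1900 gives an age of 125 years, so such rows pass the cutoff, while rows with a missing year are left out rather than flagged.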
From the next graph, we can clearly see that the cars registered in 2025 were most often manufactured in 2008 🚗, the year that stands out as the peak of registrations.
Warning: Removed 3276 rows containing non-finite outside the scale range
(`stat_density()`).
🌱 Ecology
I also decided to explore how eco-friendly our country is this year 🌍.
Below is a map of Ukraine 🇺🇦 showing the number of electric ⚡ cars in each region.
Code
electro_graph <- ggplot(ukraine_sf) +
  geom_sf(
    aes(
      fill = electro,
      text = glue("Region: {NAME_1} Electric: {electro} Total registrations: {total} Share: {round(electro / total * 100, 1)}%")
    ),
    color = "black"
  ) +
  scale_fill_gradientn(
    colours = c("#F7FEE7", "#A3E635", "#65A30D", "#3F6212", "#1A2E05"),
    na.value = "grey90",
    name = "Electric Cars"
  ) +
  theme_minimal(base_size = 14) +
  labs(title = glue("\U000026A1 Electric Mobility Across Ukraine")) +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold", color = "#14532D"),
    plot.caption = element_text(hjust = 0.5, face = "italic", color = "#14532D"),
    panel.background = element_rect(fill = "#F8FFF8", color = NA)
  )
Warning in layer_sf(geom = GeomSf, data = data, mapping = mapping, stat = stat,
: Ignoring unknown aesthetics: text
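The map code expects the per-region columns `electro` and `total`, and its tooltip computes the share as `round(electro / total * 100, 1)`. A minimal base R sketch of how such a summary could be built is shown below. The toy rows, the `region` labels and the logical column `is_electric` are hypothetical stand-ins, since the real fuel labels and the aggregation code are not shown here.

```r
# Invented toy registrations: one row per vehicle.
regs <- data.frame(
  region      = c("A", "A", "A", "B", "B"),
  is_electric = c(TRUE, FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Per-region counts: electric vehicles and all registrations.
electro <- tapply(regs$is_electric, regs$region, sum)
total   <- tapply(regs$is_electric, regs$region, length)

# Same share formula as in the map tooltip.
summary_tbl <- data.frame(
  region  = names(total),
  electro = as.vector(electro),
  total   = as.vector(total),
  share   = round(as.vector(electro) / as.vector(total) * 100, 1)
)

print(summary_tbl)
```

A table of this shape could then be joined to the region geometries by region name before plotting.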